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Abstract — Goal: A new method for heart rate monitoring 
using photoplethysmography (PPG) during physical activities is 
proposed. Methods-. It jointly estimates spectra of PPG signals 
and simultaneous acceleration signals, utilizing the multiple 
measurement vector model in sparse signal recovery. Due to a 
common sparsity constraint on spectral coefficients, the method 
can easily identify and remove spectral peaks of motion artifact 
(MA) in PPG spectra. Thus, it does not need any extra signal 
processing modular to remove MA as in some other algorithms. 
Furthermore, seeking spectral peaks associated with heart rate 
is simplified. Results : Experimental results on 12 PPG datasets 
sampled at 25 Hz and recorded during subjects’ fast running 
showed that it had high performance. The average absolute 
estimation error was 1.28 beat per minute and the standard 
deviation was 2.61 beat per minute. Conclusion and Significance-. 
These results show that the method has great potential to be 
used for PPG-based heart rate monitoring in wearable devices 
for fitness tracking and health monitoring. 

Index Terms —Photoplethysmography (PPG), Heart Rate Mon¬ 
itoring, Sparse Signal Recovery, Compressed Sensing, Wearable 
Healthcare, Health Monitoring, Fitness Tracking 


I. Introduction 

With the emerging of smart-watches and smart wristbands 
for healthcare and fitness, heart rate (HR) monitoring using 
photoplethysmography (PPG) [1], [2] recorded from wearers’ 
wrists becomes a new research topic both in industry and 
academia. HR monitoring using PPG signals has many advan¬ 
tages over traditional ECG signals, such as simpler hardware 
implementation, lower cost, and no requirement of ground 
sensors and reference sensors as in ECG recordings. 

However, during physical activities motion artifact (MA) 
contaminated in PPG signals seriously interferes with HR 
estimation. The MA is mainly caused by ambient light leaking 
into the gap between a PPG sensor surface and skin surface. 
The gap can be easily enlarged by hand movements during 
physical activities. Besides, the change in blood flow due to 
movements is another MA source [3]. Consequently, the PPG 
signal contains strong MA, while the heartbeat-related PPG 
component is weak. Therefore, estimating heart rate becomes 
very challenging. In [4], [5], several typical difficulties in HR 
estimation are discussed. 
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So far, many noise-reduction techniques have been pro¬ 
posed, such as independent component analysis (ICA) [6], 
adaptive noise cancelation (ANC) [7], [8], spectrum subtrac¬ 
tion [9], Kalman filtering [10], wavelet denoising [11], and 
empirical mode decomposition [12], to name a few. But these 
techniques were mainly proposed for scenarios when MA is 
not strong. 

The TROIKA framework [4] is a recently proposed method 
for scenarios when MA is extremely strong. It consists of 
signal decomposition, sparsity-based high-resolution spectrum 
estimation, and spectral peak tracking and verification. The 
signal decomposition aims to partially remove MA compo¬ 
nents which overlap with the heartbeat-related PPG compo¬ 
nents in the same frequency band, and also to sparsify the PPG 
spectra facilitating the sparsity-based high-resolution spectrum 
estimation. The spectral peak tracking with verification aims to 
correctly select the spectral peaks corresponding to HR, over¬ 
coming various unexpected situations such as nonexistence of 
HR-related spectral peaks. Experimental results have shown 
that the TROIKA framework has high performance during 
wearers’ intensive physical activities. 

In this paper a new approach to HR estimation is proposed, 
which is based on JOint Sparse Spectrum reconstruction, 
denoted by JOSS. It exploits the fact that the spectra of 
PPG signals and simultaneous acceleration signals have some 
common spectrum structures, and thus formulates the spectrum 
estimation of these signals into a joint sparse signal recovery 
model, called the multiple measurement vector (MMV) model 
[13]. Although the idea of using basic sparse signal recovery 
algorithms was initially proposed in TROIKA [4], the present 
approach is novel and has many advantages over TROIKA. 
The main novelty is the use of the MMV model for joint 
spectrum estimation, in contrast to the single measurement 
vector (SMV) model [4] in TROIKA. In the MMV model, the 
measurement vectors are PPG signals and acceleration signals. 
Thus, their spectra are estimated simultaneously. Using the 
MMV model has many benefits which cannot be obtained 
when using the SMV model: 

• Based on theoretical analysis, given the same sparsity 
level and compression ratio, the MMV model is known 
to have much better reconstruction performance than the 
SMV model [13]-[15], 

• Due to the common sparsity constraint of the MMV 
model [13] on the spectral coefficients, the spectral peaks 
of MA in the raw PPG spectra can be easily found by 
checking spectral peaks in the acceleration signal spectra 
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at corresponding frequency bins. 

• By comparing the spectral coefficients in the PPG spec¬ 
trum and the acceleration signal spectra at the same fre¬ 
quency bins, we can cleanse the PPG spectra, achieving 
a similar result as spectral subtraction [16]. 

Due to the above benefits, the proposed approach does not re¬ 
quire the signal decomposition and the temporal difference op¬ 
erations in TROIKA. Furthermore, the spectral peak tracking 
and verification in TROIKA can be largely simplified. Thus, 
the approach is more suitable for hardware implementation 

[17]. 

The rest of the paper is organized as follows. Section [TT| 
presents the approach. Section [HI] gives experimental results. 
Discussion and conclusion are given in the last two sections. 

II. Proposed Method 

This section first presents the motivations, and then proposes 
the JOSS method. The JOSS method has two parts. One part is 
joint sparse spectrum reconstruction using the MMV model, 
followed by a simple spectral subtraction. The second part 
is spectral peak tracking with verification. Note that before 
fed into JOSS, PPG signals and acceleration signals are first 
bandpass filtered. 

A. Motivations 

In [4] the following SMV model was used to estimate the 
sparse spectrum of a raw PPG signal, 

y = $x + v, (1) 

where y £ M Mx1 is a segment of a raw PPG signal, 3> £ 
C AIxN (M < N ) is a redundant discrete Fourier transform 
(DFT) basis, x £ C ;V x 1 is the desired solution vector, and 
v £ R Mx1 models measurement noise or modeling errors. 
The redundant DFT basis is given by 

$ ra ,n = e^ ron , to = 0, • • • , M — 1; n = 0, • • • ,N — 1 (2) 

where <t» m n denotes the (to, n)th entry of <!>. In the SMV 
model, a key assumption is that x is sparse or compressive, 
i.e. most elements of x are zero or nearly zero, while only a 
few elements have large nonzero values. 

Based on this model, the estimated fcth spectrum coefficient 
of the PPG signal, denoted by Sk, is thus given by 

Sk = \xk\ 2 , k = 1, • ■ • , N (3) 

where Xk is the fcth element of x £ C v x 1 , the estimate of x. 

Using sparse signal recovery algorithms to estimate spectra 
has advantages over conventional nonparametric spectrum 
estimation methods and line spectral estimation methods, such 
as high spectrum resolution, low estimation variance, and 
increased robustness [18]. 

In TROIKA [4] a raw PPG was first bandpass filtered and 
then cleansed by partially removing MA via a signal decompo¬ 
sition procedure. Then the cleansed PPG signal was processed 
by a temporal difference operation to further suppress low- 
frequency MA and enhance the harmonic components of 
heartbeat-related PPG components. Finally, the resulting signal 


was used to calculate its spectrum by an SMV-model-based 
sparse signal recovery algorithm. 

However, the signal decomposition procedure cannot re¬ 
move all prominent MA components. It has two drawbacks. 

First, in this procedure, a spectral peak of MA in a raw PPG 
spectrum is identified by checking if there is also a spectral 
peak in an acceleration spectrum at the same frequency 
bin. However, both the PPG spectrum and the acceleration 
spectrum are calculated using Periodogram separately. Con¬ 
sequently, even caused by a common hand movement, the 
spectral peak in the PPG spectrum and the counterpart peak 
in the acceleration spectrum may not appear at the same 
frequency bin. (They may locate at two frequency bins that 
are close to each other.) Thus, the MA spectral peak in the 
raw PPG spectrum is not identified, and will remain in the 
PPG spectrum calculated by sparse signal recovery in a later 
stage. 

Second, if an MA component has a dominant spectral peak 
locating close to the frequency bin at which the previously 
selected HR-related spectral peak locates, the MA component 
will be kept in the PPG signal. 

To overcome the drawbacks, this work proposes using 
MMV-model-based sparse signal recovery to estimate spectra 
of PPG and acceleration signals jointly. Based on this method, 
one can easily and reliably cleanse PPG spectra by removing 
MA spectral peaks, similar as spectral subtraction. 

B. Joint Sparse Spectrum Reconstruction Using the MMV 
Model 

The MMV model is an extension of the SMV model, 
estimating solution vectors jointly from multiple measurement 
vectors. In estimating power spectra of multichannel signals, 
the model can be expressed as follows 

Y = <FX + V, (4) 

where Y £ R Mxi is the matrix consisting of L measurement 
vectors, $ is the redundant DFT basis as before, X £ C NxL 
is the desired solution matrix, and V £ R Mxi represents 
measurement noise or model errors. A key assumption in the 
MMV model is that the solution matrix X is row-wise sparse, 
that is, only a few rows in X are nonzero while most rows 
are zero. This assumption is also referred to as the ‘common 
sparsity constraint’ [13]. 

To use the MMV model, the columns of the measurement 
matrix Y are segments of PPG and simultaneous acceleration 
signals in the same time window. In the experiments of this 
work, Y is formed by a segment of one channel of PPG 
and segments of three channels of simultaneous acceleration 
signals. Thus, L = 4. Each column of X yields the spectrum 
of the corresponding signal. 

Many sparse signal recovery algorithms have been proposed 
for this model [13], [15]. But not every algorithm is suitable, 
since here the matrix $ is highly coherent, i.e., high corre¬ 
lation exists between neighboring columns of T*. This work 
adopts the Regularized M-FOCUSS algorithm [13], because 
it has fast speed and reliable performance even if $ is highly 
coherent. 
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The joint sparse signal recovery has many advantages. Using 
the MMV model, we can recover unique solutions which 
are less sparse. For example, in [13] it is shown that the 
MMV model can ensure a unique solution with the number 
of nonzero rows ro satisfies 

r o < \{M + L)/2~\ — 1 (5) 

where [•] denotes the ceiling operation. Instead, the SMV 
model can only recover unique solutions with the nonzero 
coefficient number ro < \{M + l)/2] — 1. These theoretical 
results are inspiring. In the present application problem, the 
number of spectral coefficients sometimes is large due to MA. 
Although bandpass filtering is generally used before spectrum 
estimation, the spectral coefficients in the pass band are still 
many, which is an adverse situation. The TROIKA framework 
uses signal decomposition to remove MA components in the 
pass band to sparsify the spectrum, but this procedure is not 
always effective as stated in Section IH-AI In contrast, this 
situation is alleviated in the MMV model. 

Besides, given the same sparsity level (i.e., the number of 
nonzero rows of X in the MMV model equals to the number 
of nonzero coefficients of x in the SMV model) and other 
same conditions, the MMV model can yield solutions with 
smaller errors than the SMV model [14], [19]. 

In practical use, the common sparsity constraint of the 
MMV model helps identify spectral peaks of MA in PPG 
spectra by using the spectral peaks in acceleration spectra. 
Since MA components in PPG signals have many common 
frequencies with simultaneous acceleration signals, the com¬ 
mon sparsity constraint encourages the frequency locations of 
MA in PPG spectra to be aligned well with some frequency 
locations in acceleration spectra. Consequently, one can accu¬ 
rately remove the spectral peaks of MA in the PPG spectra. 
This benefit is more desirable when the heartbeat frequency is 
very close to a frequency of MA. 

Fig JT] shows an example. Taking a PPG signal and three 
simultaneous acceleration signals (shown in Fig|2]i to be the 
four measurement vectors in the MMV model, their spectra are 
jointly estimated, as shown in FigJT} a)-(d). Due to the common 
sparsity constraint on the estimated multichannel spectra, 
the frequency locations of MA in the PPG spectrum were 
exactly aligned with the frequency locations in the acceleration 
spectra. Thus, one can subtract the MA spectral peaks in 
the acceleration spectra from the PPG spectrum (refer to the 
next subsection for details), yielding the cleansed spectrum 
shown in FigjTJe). We can see there was only one spectral 
peak remained, locating at the 112th frequency bin, which 
exactly corresponded to the heartbeat frequency (estimated 
from simultaneous ECG). 

Note that in this example the heartbeat frequency is close 
to the frequencies of MA. However, the proposed method 
correctly distinguishes the spectral peaks of MA from the 
spectral peak of heartbeat. This advantage cannot be gained 
by using conventional spectrum estimation algorithms such as 
Periodogram. Fig[3] shows the results by using Periodogram 
and the same spectral subtraction. Due to the low-resolution 
of Periodogram and the leakage effect [4], the spectral peak 
corresponding to the heartbeat is incorrectly removed during 
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Fig. 1. Joint sparse signal recovery helps better identify and remove spectral 
peaks of MA in PPG spectra, (a)-(d) are sparse spectra of the PPG signal 
and three channels of acceleration signals in Fig[2] respectively, which are 
calculated using the MMV model, (e) is the final sparse spectrum of the 
PPG signal after spectral subtraction. The true heartbeat frequency (estimated 
from the ECG signal) locates at the 112th frequency bin, which is accurately 
detected in (e). Note that in this example the HR frequency is very close to 
an MA frequency. 


spectral subtraction. Instead, a false spectral peak is remained 
in the final spectrum, locating at the 116th frequency bin, 
indicating a large error of about 5.9 beat per minute (BPM). 

Due to the advantages of the MMV model, finding the 
spectral peak corresponding to heartbeat becomes easier. In the 
following subsections, a simple spectral subtraction method 
will be first described, and then a spectral peak tracking 
method is proposed. 

C. Spectral Subtraction 

Thanks to the advantages of the MMV model, spectral 
subtraction is made simple (given one PPG spectrum and three 
acceleration spectra): 

• Step 1: For each frequency bin /$(* = 1, * • • , N), choose 
the maximum value of spectral coefficients in acceleration 
spectra at fi, denoted by C t . 

• Step 2: For each frequency bin fi(i = l,--- ,N), 
subtract G, from the value of the spectral coefficient at 
fi in the PPG spectrum. Now we obtain a processed PPG 
spectrum. Denote by p max the maximum value of all 
coefficients in the PPG spectrum. 

• Step 3: Set to zero all spectral coefficients with values 
less than p max /4, yielding a cleansed PPG spectrum. 

To ensure the spectral subtraction method effective, each 
of the PPG segment and acceleration segments should be 
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Fig. 2. Segments of simultaneously recorded raw signals, (a) A segment of 
the raw ECG signal sampled at 125 Hz (which provides ground-truth of HR), 
(b) A segment of the PPG signal sampled at 25 Hz. (c)-(e) are segments of 
the acceleration signals at three channels sampled at 25 Hz. 


normalized to have the same variance (i.e., the same energy). 
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D. Spectral Peak Tracking 

As in TROIKA, spectral peak tracking is another key part. 
In this work, the proposed spectral peak tracking approach is 
simpler with less tuning parameters than the one in TROIKA. 
The approach is also based on the observation that HR values 
in two successive time windows are very close if the two time 
windows overlap largely. 

The spectral peak tracking method consists of four stages, 
namely initialization, peak selection, peak verification, and 
peak discovery. 

1) Initialization: In the initialization stage wearers are 
required to reduce hand motions as much as possible for 
several seconds, and HR is estimated by choosing the highest 
spectral peak in the PPG spectrum. 

In the proposed approach, the kurtosis of the PPG spectrum 
from 0.8 Hz to 2.5 Hz (corresponding to 48 BPM to 150 
BPM) is used to classify whether hand motions are reduced 
sufficiently. When hand motions occur, the kurtosis is small; 
otherwise, it is very large. Thus, in the proposed approach, 
if the kurtosis is larger than 10, the current time window is 
determined to be in the initialization stage, and the highest 
spectral peak in the PPG spectrum from 0.8 Hz to 2.5 Hz 
is chosen; in the next time window the approach enters the 
next stage. If the kurtosis is smaller than 10, the current time 
window is not in the initialization stage, and thus no HR 
estimate is output; in the next time window, the approach still 
checks whether it is in the initialization stage. 

2) Peak Selection: In this stage, the goal is to choose a 
peak in the PPG spectrum with the knowledge of estimated 
HR values in previous time windows. The flowchart is given 
in Fig0] 

First, a search range is set, denoted by R^. This search 
range is centered at the location of the previously estimated 


Fig. 3. Using Periodogram may result in large errors in HR estimation, (a)-(d) 
are spectra of the PPG signal and the three channels of acceleration signals in 
FigO respectively, which are calculated using the Periodogram separately, (e) 
is the final spectrum of the PPG signal after spectral subtraction as in Fig[[| 
The true heartbeat frequency (located at the 112th frequency bin) is missed. 
The nearest spectral peak locates at the 116th frequency bin, indicating the 
error is about 5.9 BPM (each frequency bin corresponds to about 1.4 BPM). 


spectral peak. In particular, denote by prevLoc the frequency 
bin corresponding to the previously estimated HR, and denote 
by prevBPM the corresponding BPM value. Then Ri = 
[prevLoc— A l5 prevLoc+ A]J, where A ; is a positive integer. 

Next, in the search range R\ we seek at most 3 spectral 
peaks. If find any, then choose the one closest to prevLoc. 
If not, then set another search range f ?2 = [prevLoc — 
A 2 , prevLoc + A 2 ] with A 2 > Ai. If in this range we still 
cannot find any peak, then output prevLoc and prevBPM. If 
find any, then select the one with the highest value. 

When selecting the peak closest to prevLoc, one may face a 
situation that two peaks have equal distance to prevLoc. (One 
has lower frequency and another has higher frequency.) To 
decide which peak should be selected, it is needed to predict 
whether current HR value will increase or decrease compared 
to the previously estimated HR value. This can be efficiently 
done by performing the Smoother algorithm in [20] on the 
spectral locations of H previously estimated HR values, where 
H > 10. The smoothing parameter in the algorithm should be 
set to a small value such that we can predict local change of 
HR values. 

Let curLoc and curBPM denote the frequency bin and 
associated BPM of the selected spectral peak, respectively. 

3) Peak Verification: The previous stage sometimes can 
wrongly select a spectral peak associated with MA. Thus the 
verification stage is necessary. The flowchart is given in FigJ3 
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Fig. 4. Flowchart of the peak selection procedure. 


Fig. 5. Flowchart of the peak verification procedure. 


The verification method is based on an observation that the 
change of BPM values in two successive time windows rarely 
exceeds 12 BPM. If the difference between curBPM and 
prevBPM is larger than 12 BPM, then it is highly possible 
that the selected spectral peak is incorrect. So, curLoc and 
curBPM are reset to prevLoc and prevBPM, respectively. 

However, when curLoc = prevLoc happens in multiple 
successive time windows, this indicates that the target spectral 
peak is lost. Thus, a discovery procedure is triggered. 

4) Peak Discovery: Fig [6] shows the flowchart of peak dis¬ 
covery. The Smoother algorithm is performed on the spectral 
locations of K previously estimated HR values to predict the 
spectral location of heartbeat in current time window. Different 
to the prediction in Peak Selection Stage, here the goal is to 
predict a macro-trend of HR evolution. So K is much larger 
than H, and the smoothing parameter is set to a large value. 
Since there is large uncertainty on the exact spectral location 
of HR, a larger search range is set than before. In particular, 
the search range is i ?3 = [predictLoc—A 3 , predictLoc+Aa], 
where predictLoc is the predicted location, and A 3 is a large 
integer. 

III. Experimental Results 

A. Datasets 

The JOSS algorithm was evaluated on the 12 PPG datasets 
initially used in [4] Q. Each dataset contains a channel of 

'The datasets are also the training sets of 2015 IEEE Signal Processing Cup: 
http://icassp2015.org/signal-processing-cup-2015/ 


PPG, three channels of acceleration signals, and a channel 
of ECG, all recorded simultaneously from 12 healthy male 
subjects with age ranging from 18 to 35. In each dataset, the 
PPG signal was recorded from the wrist (dorsal) using a pulse 
oximeter with green LED (wavelength: 515nm). The three- 
channel acceleration signal was also recorded from the wrist 
using a three-axis accelerometer. Both the pulse oximeter and 
the accelerometer were embedded in a wristband. The ECG 
signal was recorded from the chest using wet ECG electrodes, 
from which the ground-truth of HR was calculated to evaluate 
algorithm performance. All signals were sent to a nearby 
computer via Bluetooth. 

During data recording the subjects walked or ran on a 
treadmill with the following speeds in order: the speed of 1- 
2 km/hour for 0.5 minute, the speed of 6-8 km/hour for 1 
minute, the speed of 12-15 km/hour for 1 minute, the speed 
of 6-8 km/hour for 1 minutes, the speed of 12-15 km/hour for 
1 minute, and the speed of 1-2 km/hour for 0.5 minute. 

All signals were initially sampled at 125 Hz. But in the 
present experiments the PPG signals and the acceleration 
signals were downsampled to 25 Hz. Some segments of these 
signals are shown in Fig|2] 

B. Experimental Settings 

As in [4], a time window of 8 seconds was used to slide the 
simultaneous PPG signal and acceleration signals, with a step 
of 2 seconds. HR was estimated in each time window. Before 
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Fig. 6. Flowchart of the peak discovery procedure. 


performing JOSS, all raw signals were bandpass filtered from 
0.4 Hz to 4 Hz using the 2nd order Butterworth filter. 

The Regularized M-FOCUSS algorithm [13] was used to 
estimate the solution matrix of the MMV model, with the pa¬ 
rameter p = 0.8, the regularization parameter A = 10 -10 , and 
the spectrum grid number N = 1024. Its maximum iteration 
number was set to 4. Note that the TROIKA algorithm also 
used the M-FOCUSS algorithm to estimate sparse spectrum of 
PPG signals. However, in TROIKA, the M-FOCUSS algorithm 
was used to estimate the solution of the SMV model B 

In the part of spectral peak tracking, Ai = 15, A 2 = 25, 
and A 3 = 30. The smoothing parameter used in the Peak 
Selection Stage was set to 5 and H = 10, while in the Peak 
Discovery Stage the smoothing parameter was set to 20 and 
K = 30. 


C. Performance Measurement 

The ground-truth HR was calculated from the original raw 
ECG signals (sampled at 125 Hz) by manually picking the R- 
peaks one by one in each time widnow. No R-peak detection 
algorithm was used, in order to avoid any possible detection 
errors. These ground-truth HR values are now available in the 
PPG datasets [4] for performance evaluation. 

2 Since the SMV model is a special case of the MMV model 0 when L = 
1, almost all MMV-model-based algorithms can work on the SMV model. 
But in this case, the benefits of exploiting multiple measurement vectors do 
not exhibit. 


Histogram 



Absolute Error of JOSS - Absolute Error of TROIKA at Each Estimate 


Fig. 7. Histogram of the difference between the absolute error of JOSS and 
the absolute error of TROIKA at each estimate over the 12 datasets. Based 
on the t-test, the absolute estimation error of JOSS was significantly smaller 
than that of TROIKA at the significant level a = 0.01, and the p value was 
6.3 x icr 39 . 


The measurement indexes in [4] were used. One was the 
average absolute error (in BPM), defined as 

1 w 

Error 1 = — ^ |BPM est («) - BPM tr ue(i)| (6) 

i= 1 


where BPM trU e(*) denotes the ground-truth of HR in the i-th 
time window, BPM est (*) denotes the estimated HR, and W 
is the total number of time windows. 

The second was the average absolute error percentage, 
defined as 


1 W 

Error 2 =W^ 

i =1 


BPM est (j) - BPMtrue(t) 
BPM trU e(*) 


(7) 


The third index was the Bland-Altman plot [21]. The 
Limit of Agreement (LOA) was used, which is defined as 
[p — 1.96cr, p + 1.96er], where p is the average difference 
between each estimate and the associated ground-truth against 
their average, and a is the standard deviation. 

Pearson correlation between ground-truth values and esti¬ 
mates was also adopted. 


D. Results 

Table I and Table II list the average absolute error (Errorl) 
and the average absolute error percentage (Error2) of the 
proposed JOSS algorithm on all 12 datasets, respectively 0. To 
show its superiority, it was compared with the recently pro¬ 
posed TROIKA algorithm. For direct comparison, TROIKA 
was also performed on the PPG and acceleration data sampled 
at 25 Hz. The results in Table I and Table II show that JOSS 
had better performance than TROIKA. Averaged across the 
12 datasets, the absolute estimation error (Errorl) of JOSS 

4 : or several datasets, the initial recordings contain strong MA, probably due 
to device adjustment after the recording system turned on. For these recording 
segments, both algorithms did not output HR estimates. Thus the algorithms’ 
performance was not evaluated on these segments. These excluded segments 
were the first 12 seconds of Set 2, the first 8 seconds of Set 3, the first 2 
seconds of Set 4, the first 2 seconds of Set 8, the first 6 seconds of Set 10, 
and the first 2 seconds of Set 11. 
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TABLE I 

Comparison of JOSS and TROIKA in terms of average absolute errors (ErrorI) on the 12 datasets sampled at 25 Hz. 

The unit is BPM. SD denotes standard deviation. 



Set 1 

Set 2 

Set 3 

Set 4 

Set 5 

Set 6 

Set 7 

Set 8 

Set 9 

Set 10 

Set 11 

Set 12 

Average 

JOSS 

1.33 

1.75 

1.47 

1.48 

0.69 

1.32 

0.71 

0.56 

0.49 

3.81 

0.78 

1.04 

1.28 (SD=2.61) 

TROIKA 

2.87 

2.75 

1.91 

2.25 

1.69 

3.16 

1.72 

1.83 

1.58 

4.00 

1.96 

3.33 

2.42 (SD=2.47) 


TABLE II 

Comparison of JOSS and TROIKA in terms of average absolute error percentage (Error2) on the 12 datasets sampled at 25 Hz. SD 

DENOTES STANDARD DEVIATION. 



Set 1 

Set 2 

Set 3 

Set 4 

Set 5 

Set 6 

Set 7 

Set 8 

Set 9 

Set 10 

Set 11 

Set 12 

Average 

JOSS 

1.19% 

1.66% 

1.27% 

1.41% 

0.51% 

1.09% 

0.54% 

0.47% 

0.41% 

2.43% 

0.51% 

0.81% 

1.01% (SD=2.29%) 

TROIKA 

2.18% 

2.37% 

1.50% 

2.00% 

1.22% 

2.51% 

1.27% 

1.47% 

1.28% 

2.49% 

1.29% 

2.30% 

1.82% (SD=2.07%) 



Fig. 8. Estimation results on Dataset 8. The estimated HR trace by JOSS Fig. 9. The Bland-Altman plot of the estimation results on the 12 datasets, 

was almost the same as the ground-truth of HR trace, while TROIKA got The LOA was [—5.94, 5.41] BPM. 

errors sometimes. 


was 1.28 ± 2.61 BPM (mean ± standard deviation), and the 
error percentage (Error2) was 1.01% ±2.29%. In contrast, the 
Error 1 of TROIKA was 2.42 ± 2.47 BPM, and the Error2 was 
1.82% ± 2.07%. 

To better compare the performance of JOSS with TROIKA, 
Fig[7] plots the histogram of the difference between the ab¬ 
solute error of JOSS and the absolute error of TROIKA at 
each estimate over the 12 datasets, i.e., the histogram of 
a(i) — P(i), where a(i) indicates the absolute estimation error 
of JOSS at the ith heart rate estimate, and j3(i) indicates that of 
TROIKA at the ith estimate. Based on the t- test, the absolute 
estimation error of JOSS was significantly smaller than that 
of TROIKA at the significant level a = 0.01, and the p value 
was 6.3 x 10 -39 . 

Fig[8] shows the estimated HR traces of both JOSS and 
TROIKA on Dataset 8 (randomly chosen). JOSS had better 
performance than TROIKA; its estimated HR trace was almost 
the same as the ground-truth of HR trace, while TROIKA 
sometimes got errors. 

FiglU gives the Bland-Altman plot. The LOA was 
[—5.94, 5.41] BPM. The Scatter Plot between the ground-truth 
heart rate values and the associated estimates over the 12 
datasets is given in Fig JTOl which shows the fitted line was 
y = 0.991* ± 0.432, where x indicates the ground-truth heart 


rate value, and y indicates the associated estimate. The Pearson 
coefficient was 0.993. 

IV. Discussions 
A. On the Experiments 

In the experiments the subjects had yellow skin-color and 
the LED light was green. Skin-color and LED light are known 
factors that affect characteristic of PPG signals, and thus affect 
algorithm performance. Evaluating the proposed algorithm’s 
performance at other skin-color and LED light will be the 
future work. 

Studies have shown that the PPG pulse rate variability is 
a surrogate of heart rate variability when subjects are at rest 
[22], However, when body movements occur, the correlation 
between PPG pulse-to-pulse intervals and ECG beat-to-beat 
intervals is not very high, and the correlation varies largely 
at different anatomical measurement sites [3]. Therefore, it 
is more reasonable to compare the ground-truth HR and the 
estimated HR in an ‘average’ level, instead of comparing each 
single cardiac cycle period. In the present experiments, the 
ground-truth HR was calculated from ECG in the time window 
of 8 seconds by the formula 60 C/D (in BPM), where C was 
the number of cardiac cycles in the time window and D was 
the duration (in seconds) of these cycles [4]. And the estimated 
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Correlation r = 0.993 



Fig. 10. Scatter plot between the ground-truth heart rate values and the 
associated estimates over the 12 datasets. The fitted line was y = 0.991 x + 
0.432, where x indicates the ground-truth heart rate value, and y indicates the 
associated estimate. The R 2 value, a measure of goodness of fit, was 0.986. 
The Pearson correlation was 0.993. 

HR was calculated from the spectrum of PPG in the same time 
window. These calculation methods actually obtain averaged 
heartbeat periods in each time window, which is relatively 
resistant to the beat-to-beat interval variability. 

B. Advantages and Possible Improvement Approaches of JOSS 

As shown in the experiments, JOSS works well at low sam¬ 
pling rates. This is a desirable advantage over algorithms that 
only work well at high sampling rates, since a low sampling 
rate means low energy consumption in data acquisition and in 
wireless transmission, which can largely extend battery life in 
wearable devices. 

JOSS exploits a common sparsity structure in the spectra 
of PPG signals and simultaneous acceleration signals by using 
the MMV model in sparse signal recovery. The MMV model 
is known to be superior to the basic sparse signal recovery 
model used in TROIKA. It is worth noting that there are many 
other structures in the spectra of PPG and acceleration signals 
which can be exploited. Thus, more advanced sparse signal 
recovery models may be used to further improve spectrum 
estimation performance and HR estimation performance, such 
as the model exploiting magnitude correlation among spectra 
[15], [23], 

Gridless joint spectral compressed sensing [18], [24] is 
another promising framework to reduce HR estimation errors. 
When HR frequency does not locate at a frequency grid, 
estimation errors are unavoidable, and the PPG spectrum 
calculated from the MMV model may be less-sparse. This 
may result in some difficulties for conventional sparse signal 
recovery algorithms. But gridless joint spectral compressed 
sensing can potentially solve this issue. 

There are other possible approaches to improve HR esti¬ 
mation. One approach is applying smoothing techniques to 
estimated HR traces, such as the median filtering. However, 
most effective smoothing algorithms perform offline. Thus this 
technique may be an effective approach when strict real-time 
is not required. 


Although JOSS does not require an extra noise-removal 
modular, using a noise-removal modular can improve its 
robustness to MA. For example, JOSS can also adopt the signal 
decomposition modular used in TROIKA to partially remove 
MA before joint sparse spectrum reconstruction. Of course, 
this increases processing time, power consumption, and circuit 
design complexity if implemented in VLSI or FPGA. 

V. Conclusion 

In this work a PPG-based heart rate monitoring method 
was proposed for fitness tracking via smart-watches or other 
wearable devices. The method uses the multiple measurement 
vector model in sparse signal recovery to jointly estimate 
sparse spectra of PPG signals and simultaneous acceleration 
signals. Due to the common sparsity constraint on the spectral 
coefficients, identifying and removing spectral peaks of MA 
in PPG spectra is easier. Thus, it does not need an extra signal 
processing stage to remove MA as in other algorithms. This 
largely simplifies the whole algorithm. Besides, it works well 
for data sampled at low sampling rates, thus saving energy 
consumption in data acquisition and wireless transmission. 
Therefore, the proposed method has potential to be imple¬ 
mented in VLSI or FPGA in wearable devices. 
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